Skip to content

fix: avoid eagerly copying uploaded files into memory#6174

Merged
masenf merged 5 commits intoreflex-dev:mainfrom
FarhanAliRaza:memory-issue
Mar 16, 2026
Merged

fix: avoid eagerly copying uploaded files into memory#6174
masenf merged 5 commits intoreflex-dev:mainfrom
FarhanAliRaza:memory-issue

Conversation

@FarhanAliRaza
Copy link
Copy Markdown
Collaborator

Previously the upload handler read every file into a BytesIO buffer before passing it to the event handler, doubling peak memory for large uploads. Now the original Starlette temp files are passed through directly and form_data.close() is deferred until the streaming response completes.

  • Pass file.file directly instead of copying into BytesIO
  • Defer form_data.close() to finally block in streaming generator
  • Close form_data on error path to prevent resource leaks
  • Replace mock form data in tests with real Starlette FormData
  • Add test verifying files are not read before handler runs

After these fixes, 4 GB will peak at about 4 GB of RAM when the user actually reads the file. Before there were two spikes, once when copying and once when the user read the file in its view.

All Submissions:

  • Have you followed the guidelines stated in CONTRIBUTING.md file?
  • Have you checked to ensure there aren't any other open Pull Requests for the desired changed?

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

closes #3517

Previously the upload handler read every file into a BytesIO buffer
before passing it to the event handler, doubling peak memory for large
uploads. Now the original Starlette temp files are passed through
directly and form_data.close() is deferred until the streaming response
completes.

- Pass file.file directly instead of copying into BytesIO
- Defer form_data.close() to finally block in streaming generator
- Close form_data on error path to prevent resource leaks
- Replace mock form data in tests with real Starlette FormData
- Add test verifying files are not read before handler runs
@FarhanAliRaza
Copy link
Copy Markdown
Collaborator Author

There is also another method to solve this issue. But that will require a more involved refactor. We kind of use Pheonix live view method. We upload the files in chunks from the frontend and the backend writes those chunks to files. That will be the nuclear option that can reduce Ram a lot. I have to fully research it . It sounds possible.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 13, 2026

Greptile Summary

This PR eliminates the peak-memory double-spike for large uploads by removing the eager BytesIO copy of every uploaded file and instead passing the original Starlette SpooledTemporaryFile reference directly to the event handler. form_data.close() is deferred to a finally block inside the _ndjson_updates streaming generator so the temp files remain alive until the handler has finished reading them, with an explicit cleanup on the error path to prevent resource leaks.

Key changes:

  • file.file is now passed directly to UploadFile(...) instead of first calling await file.read() and copying into a BytesIO; peak RAM for a 4 GB upload drops from ~8 GB to ~4 GB.
  • form_data.close() is deferred to a try/finally inside the streaming generator so it executes after the handler completes, not before.
  • An except Exception: await form_data.close(); raise guard ensures cleanup on any error that prevents the streaming generator from ever starting.
  • Tests updated to use real Starlette FormData instead of mock objects, and a new test (test_upload_file_keeps_form_open_until_stream_completes) asserts that neither file.read nor form_data.close is called until the response body is consumed.

Minor concerns:

  • The existing test_upload_file constructs both UploadFile objects from the same io.BytesIO that was left at EOF after bio.write(data), so the handler silently reads empty bytes for both files. This is a pre-existing issue made more visible now that the old seek(0) on the copy is gone; a bio.seek(0) before wrapping would make the test more realistic.
  • except Exception does not catch asyncio.CancelledError (a BaseException in Python ≥ 3.8), leaving a narrow window where form_data could leak if the request task is cancelled between _create_upload_event returning and the streaming generator starting.

Confidence Score: 4/5

  • This PR is safe to merge with minor caveats around task-cancellation cleanup and a pre-existing test data setup issue.
  • The core logic is correct: Starlette temp files are properly kept alive until the handler finishes reading them, and explicit cleanup happens in both the happy path (finally block) and the error path (except block). The only meaningful risk is a narrow resource-leak window on asyncio task cancellation that except Exception doesn't cover, plus a pre-existing test deficiency where files are read from EOF. No regressions to existing behavior are introduced.
  • reflex/app.py lines 1993-1997 (except Exception cancellation gap); tests/units/test_app.py lines 958-960 (bio at EOF)

Important Files Changed

Filename Overview
reflex/app.py Upload handler refactored to pass Starlette temp files directly (avoiding BytesIO copy), with form_data lifetime deferred to the streaming generator's finally block; minor resource-leak risk on task cancellation in the error-path except clause.
tests/units/test_app.py Mock form data replaced with real Starlette FormData objects; new test verifies lazy file access and deferred close; existing test_upload_file constructs both UploadFiles with bio at EOF, so handler always reads empty bytes (pre-existing, but more visible now that the copy+seek(0) is gone).

Sequence Diagram

sequenceDiagram
    participant Client
    participant UploadRoute as upload_file (Starlette route)
    participant CreateEvent as _create_upload_event()
    participant NdjsonGen as _ndjson_updates() generator
    participant Handler as State event handler
    participant FormData as form_data (Starlette temp files)

    Client->>UploadRoute: POST /upload (multipart)
    UploadRoute->>FormData: await request.form()
    FormData-->>UploadRoute: form_data (SpooledTempFile refs)

    UploadRoute->>CreateEvent: await _create_upload_event()
    CreateEvent->>FormData: form_data.getlist("files")
    Note over CreateEvent: wraps file.file directly<br/>(no BytesIO copy)
    CreateEvent-->>UploadRoute: Event(payload={files: [UploadFile(file=file.file)]})

    alt _create_upload_event raises
        UploadRoute->>FormData: await form_data.close()  [cleanup]
        UploadRoute-->>Client: raise exception
    else success
        UploadRoute-->>Client: StreamingResponse(_ndjson_updates())
        loop body_iterator consumed
            NdjsonGen->>Handler: state._process(event)
            Handler->>FormData: await file.read()  [reads temp file]
            Handler-->>NdjsonGen: state updates
            NdjsonGen-->>Client: yield update JSON
        end
        NdjsonGen->>FormData: await form_data.close()  [finally block]
        Note over FormData: SpooledTempFile deleted from disk
    end
Loading

Comments Outside Diff (1)

  1. tests/units/test_app.py, line 958-974 (link)

    bio left at EOF before being wrapped in UploadFile

    bio.write(data) leaves the internal position at the end of the buffer (offset 19). Both file1 and file2 share this same BytesIO object, so the event handler will always read empty bytes from both files.

    In the old code the framework copied the content (content_copy.write(await file.read()) → empty → seek(0)) before handing copies to the handler, so the test passed but was already silently broken. With the new code, the raw file.file reference is passed through unchanged, making this even more visible.

    The test still passes because multi_handle_upload only calls self.img_list.append(file.name) regardless of content, but the file written to tmp_path will be empty for every parameterized variant.

    Consider adding bio.seek(0) after bio.write(data) so the test actually exercises a realistic file-read path:

Last reviewed commit: a1961f1

Comment thread reflex/app.py Outdated
Comment on lines +1993 to +1997
try:
event = await _create_upload_event()
except Exception:
await form_data.close()
raise
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BaseException subclasses bypass form_data cleanup

except Exception does not catch BaseException subclasses such as asyncio.CancelledError (in Python ≥ 3.8 it is a BaseException, not an Exception). If the task is cancelled between _create_upload_event() returning and _ndjson_updates being entered, form_data is never closed and the underlying temp file / spooled file leaks.

A try / finally approach avoids the gap, since the success path hands ownership to _ndjson_updates:

event: Event | None = None
try:
    event = await _create_upload_event()
finally:
    if event is None:
        await form_data.close()

Alternatively, except BaseException with a raise covers cancellation as well.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq bot commented Mar 13, 2026

Merging this PR will not alter performance

✅ 8 untouched benchmarks


Comparing FarhanAliRaza:memory-issue (70c199b) with main (7607fa3)

Open in CodSpeed

Copy link
Copy Markdown
Collaborator

@masenf masenf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work. I think the greptile comment is onto something, but instead of a finally, which would be detremental, just catch BaseException instead.

The test improvements are nice, I just had one suggestion for improving the assertions on the existing test case we had.

Comment thread tests/units/test_app.py Outdated
assert not bio.closed

async for _ in streaming_response.body_iterator:
pass
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should assert not bio.closed for each iteration of this loop

Comment thread tests/units/test_app.py
files_mock.getlist = getlist

return files_mock
return FormData([("files", file1), ("files", file2)])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you extend this test:

  1. use actual different BinaryIO instances for each file
  2. set a _tmp_path in the file upload state using app.modify_state (instead of the current get_state). This path is where the handler will actually write the data that it reads from the "files"
  3. update multi_handle_upload to yield at the end of processing each UploadFile
  4. each iteration of the streaming loop should assert that both BinaryIO handles remain unclosed
  5. at the end of processing, also check that the files got written to the tmp dir correctly and that both BinaryIO handles got closed

- Extract _FileUploadMixin to eliminate duplicated fields and methods
  across FileUploadState, ChildFileUploadState, and GrandChildFileUploadState
- Fix shared BytesIO bug where both UploadFile instances shared one
  buffer, causing the second file to write empty content
- Use app.modify_state to set _tmp_path on the state instance instead
  of setting class attributes directly
- Assert incremental delta updates during streaming (yield per file)
- Verify both BinaryIO handles remain open during streaming and close
  after form_data.close()
- Verify uploaded files are written to tmp dir with correct content
- Remove dead state._tmp_path assignments from error-case tests
Replace `except Exception` with `try/finally` + `event is None` guard so
that `asyncio.CancelledError` (a BaseException) also triggers form_data
cleanup.

Add idempotent `_close_form_data()` helper to safely deduplicate cleanup
across three paths: event creation failure, generator completion, and
ASGI response teardown.

Introduce `_UploadStreamingResponse` (module-level) that wraps
`__call__` in try/finally as a safety net for the case where the
response is cancelled before the generator is ever entered.

Add tests verifying form_data.close() is called on:
- CancelledError during event creation (get_state cancelled)
- CancelledError during ASGI send (before generator iteration)
@FarhanAliRaza
Copy link
Copy Markdown
Collaborator Author

I have fixed the reviews. _UploadStreamingResponse finally can be reverted, it will not affect 99% of request it is just extra safety.

Copy link
Copy Markdown
Collaborator

@masenf masenf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion based on your last comment.

i like the new approach of baking the close into the streaming response.

Comment thread reflex/app.py Outdated
Comment on lines +2038 to +2048
try:
# Process the event.
async with app.state_manager.modify_state_with_links(
event.substate_token
) as state:
async for update in state._process(event):
# Postprocess the event.
update = await app._postprocess(state, event, update)
yield update.json() + "\n"
finally:
await _close_form_data()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah i think cleaning up this path since we handle it with _UploadStreamingResponse makes the code more maintainable

Suggested change
try:
# Process the event.
async with app.state_manager.modify_state_with_links(
event.substate_token
) as state:
async for update in state._process(event):
# Postprocess the event.
update = await app._postprocess(state, event, update)
yield update.json() + "\n"
finally:
await _close_form_data()
# Process the event.
async with app.state_manager.modify_state_with_links(
event.substate_token
) as state:
async for update in state._process(event):
# Postprocess the event.
update = await app._postprocess(state, event, update)
yield update.json() + "\n"

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@FarhanAliRaza
Copy link
Copy Markdown
Collaborator Author

FarhanAliRaza commented Mar 14, 2026

Integration tests pass locally . The test is maybe flaky.

@masenf masenf merged commit fe34ae8 into reflex-dev:main Mar 16, 2026
47 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

File upload appears to hold everything in memory until the end

2 participants